Context-sensitive statistical language modeling

Authors

  • Alexander Gruenstein
  • Chao Wang
  • Stephanie Seneff
Abstract

We present context-sensitive dynamic classes – a novel mechanism for integrating contextual information from spoken dialogue into a class n-gram language model. We exploit the dialogue system’s information state to populate dynamic classes, thus percolating contextual constraints to the recognizer’s language model in real time. We describe a technique for training a language model incorporating context-sensitive dynamic classes which considerably reduces word error rate under several conditions. Significantly, our technique does not partition the language model based on potentially artificial dialogue state distinctions; rather, it accommodates both strong and weak expectations via dynamic manipulation of a single model.
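
The core idea, a class n-gram whose class membership is repopulated from dialogue context at run time, can be illustrated with a toy sketch. This is not the authors' implementation or training technique, which the abstract does not detail. The class name `$CITY`, the uniform within-class probabilities, the hard-coded class bigram table, and all identifiers below are assumptions made for illustration only.

```python
class DynamicClassBigram:
    """Toy class bigram LM whose class membership can be replaced at run time."""

    def __init__(self, class_bigram_probs):
        # P(class_i | class_{i-1}); assumed fixed after training.
        self.class_bigram_probs = class_bigram_probs
        # class name -> {word: P(word | class)}; repopulated per dialogue turn.
        self.members = {}

    def set_class(self, name, words):
        # Assumption: members of a dynamic class are equally likely.
        words = list(words)
        self.members[name] = {w: 1.0 / len(words) for w in words}

    def class_of(self, word):
        # A word found in a dynamic class maps to that class;
        # any other word is treated as its own singleton class.
        for name, dist in self.members.items():
            if word in dist:
                return name
        return word

    def prob(self, prev_word, word):
        # P(word | prev_word) = P(C(word) | C(prev_word)) * P(word | C(word))
        prev_c, c = self.class_of(prev_word), self.class_of(word)
        p_class = self.class_bigram_probs.get((prev_c, c), 0.0)
        p_member = self.members.get(c, {word: 1.0}).get(word, 0.0)
        return p_class * p_member


# Hypothetical usage: the dialogue state narrows the set of expected cities.
lm = DynamicClassBigram({("to", "$CITY"): 0.5})
lm.set_class("$CITY", ["boston", "denver"])
p_before = lm.prob("to", "boston")

lm.set_class("$CITY", ["boston"])  # context now expects only "boston"
p_after = lm.prob("to", "boston")
p_removed = lm.prob("to", "denver")
```

In this sketch a single model is kept throughout; only the contents of the dynamic class change, which shifts probability mass toward the words the dialogue context currently expects.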

Similar resources

Acoustic modeling and language modeling for Cantonese LVCSR

This paper describes our recent work on the development of a large-vocabulary, speaker-independent continuous speech recognition system for Cantonese (a major Chinese dialect). Both acoustic modeling and language modeling are being addressed. For acoustic modeling, we focus on right-context-dependent sub-syllable units. Tying of HMM at model as well as state level is applied based on phonetic k...

Stochastic k-Tree Grammar and Its Application in Biomolecular Structure Modeling

Stochastic context-free grammar (SCFG) has been successful in modeling biomolecular structures, typically RNA secondary structure, for statistical analysis and structure prediction. Context-free grammar rules specify parallel and nested co-occurrences of terminals, and thus are ideal for modeling nucleotide canonical base pairs that constitute the RNA secondary structure. Stochastic grammars h...

Statistical Modeling of Pronunciation Variation by Hierarchical Grouping Rule Inference

In this paper, a data-driven approach to the statistical modeling of pronunciation variation is proposed. It consists of learning stochastic pronunciation rules. The proposed method jointly models different rules that define the same transformation. The Hierarchic Grouping Rule Inference (HIEGRI) algorithm is proposed to generate this model based on graphs. The HIEGRI algorithm detects the common patterns of ...

Word Sense Disambiguation for Statistical Machine Translation

While much effort has been put into designing and evaluating Word Sense Disambiguation (WSD) models for translation in the WSD community, standard Statistical Machine Translation (SMT) systems have achieved remarkable improvements in translation quality without modeling WSD explicitly. However, inspecting SMT output suggests that SMT needs better semantic modeling to accurately translate meaning....

Improving Language Modeling by Combining Heterogeneous Corpora

In applying statistical language modeling, directly adding training data (e.g. from websites) may not always improve the performance of language models, because the data may not be suitable for the application or may contain errors. This paper presents a method of combining multiple heterogeneous corpora to improve the resulting language models, called compressed context-dependent interpolation schem...

Publication date: 2005